N=1:

rank | frequency | n-gram |
---|---|---|
1 | 70098 | -s |
2 | 51574 | -a |
3 | 50608 | -o |
4 | 38851 | -e |
5 | 16649 | -m |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 25231 | -os |
2 | 18685 | -as |
3 | 13054 | -es |
4 | 11052 | -do |
5 | 10238 | -se |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 7585 | --se |
2 | 5967 | -mos |
3 | 4973 | -ado |
4 | 4763 | -nte |
5 | 4551 | -dos |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 3584 | -ente |
2 | 3316 | -ados |
3 | 3068 | -ação |
4 | 2748 | -ando |
5 | 2742 | -adas |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 2544 | -mente |
2 | 1471 | -am-se |
3 | 1461 | -idade |
4 | 1380 | -mento |
5 | 964 | -ações |
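
The tables show the most frequent letter n-grams at the end of words for N=1…5. The leading hyphen marks the word ending and is not part of the n-gram, so an entry such as --se stands for the 3-gram "-se". The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection instead of prefix detection.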
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
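
The same counting can be sketched outside the database for any N. The following is a minimal Python illustration, not the code behind the tables: the function name `top_endings` and the toy word list are hypothetical, and the `w_id>100` filter of the query is not reproduced. Like the query, it counts each list entry once and prefixes every n-gram with a hyphen.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams
    as (rank, frequency, n-gram) rows."""
    # word[-n:] mirrors SQL RIGHT(word, n): a word shorter than n
    # contributes the whole word.
    counts = Counter(word[-n:] for word in words)
    # most_common(k) orders by descending frequency, like ORDER BY cnt DESC LIMIT k.
    return [
        (rank, freq, "-" + gram)
        for rank, (gram, freq) in enumerate(counts.most_common(k), start=1)
    ]


if __name__ == "__main__":
    # Hypothetical toy word list, for illustration only.
    words = ["casas", "carros", "livros", "falando",
             "claramente", "lentamente", "somente", "dizem-se"]
    for n in range(1, 6):
        print("N=%d:" % n)
        for row in top_endings(words, n):
            print("  %d | %d | %s" % row)
```

A word that ends in a hyphenated clitic, such as "dizem-se", yields the 3-gram "-se", which appears as --se once the marker hyphen is prepended.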